Substrings

IDL provides the STRPOS, STRPUT, and STRMID routines to locate, insert, and extract substrings from their string arguments.

Searching for a Substring

The STRPOS function searches for the first occurrence of a substring and returns its character position, or -1 if the substring is not found. It has the form:

S = STRPOS(Object, Search_string[, Position])

where Object is the string to be searched, Search_string is the substring to search for, and Position is the character position (starting with position 0) at which the search is begun. If the optional argument Position is omitted, the search is started at the first character (character position 0). The following IDL procedure counts the number of times that the word “dog” appears in the string “dog cat duck rabbit dog cat dog”:

PRO Animals

   ; The search string, "dog", appears three times.
   animals = 'dog cat duck rabbit dog cat dog'

   ; Start searching in character position 0.
   I = 0

   ; Number of occurrences found.
   cnt = 0

   ; Search for an occurrence.
   WHILE (I NE -1) DO BEGIN
      I = STRPOS(animals, 'dog', I)

      IF (I NE -1) THEN BEGIN
         ; Update counter.
         cnt = cnt + 1

         ; Increment I so as not to count the same instance of 'dog'
         ; twice.
         I = I + 1
      ENDIF
   ENDWHILE

   ; Print the result. STRTRIM removes the padding that PRINT would
   ; otherwise add around the number.
   PRINT, 'Found ' + STRTRIM(cnt, 2) + " occurrences of 'dog'"

END

Running the above program produces the result below.

Found 3 occurrences of 'dog'
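The same search loop can record where each match occurs instead of only counting matches. The following is a minimal sketch, not an IDL library routine: the function name find_all_positions is hypothetical, and the code assumes a non-empty search string.

```idl
; Hypothetical helper (not part of IDL): return every position at which
; search_string occurs in object, or -1 if it never occurs.
FUNCTION find_all_positions, object, search_string

   ; Result returned when no match is found.
   positions = -1

   ; Look for the first match, starting at character position 0.
   i = STRPOS(object, search_string)

   WHILE (i NE -1) DO BEGIN
      ; Store the match, replacing the initial -1 on the first pass.
      IF (positions[0] EQ -1) THEN positions = i $
      ELSE positions = [positions, i]

      ; Resume the search one character past the current match.
      i = STRPOS(object, search_string, i + 1)
   ENDWHILE

   RETURN, positions
END
```

Called as find_all_positions('dog cat duck rabbit dog cat dog', 'dog'), this sketch should return the positions 0, 20, and 28.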

Searching For the Last Occurrence of a Substring

The REVERSE_SEARCH keyword to the STRPOS function makes it easy to find the last occurrence of a substring within a string. In the following example, we search for the last occurrence of the letter “I” (or “i”) in a sentence:

sentence = 'IDL is fun.'
sentence = STRUPCASE(sentence)
lasti = STRPOS(sentence, 'I', /REVERSE_SEARCH)
PRINT, lasti

This results in:

4

Note that although REVERSE_SEARCH tells STRPOS to begin searching from the end of the string, the STRPOS function still returns the position of the search string starting from the beginning of the string (where 0 is the position of the first character).
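A common use of a reverse search is to split a string at its last delimiter. The sketch below is illustrative only; the file name is a made-up value. It combines STRPOS with STRMID to take everything after the last period.

```idl
; Hypothetical file name, used only for illustration.
file = 'data.v2.dat'

; Position of the last period, counted from the start of the string.
dot = STRPOS(file, '.', /REVERSE_SEARCH)

; Everything after the last period, or an empty string if there is none.
IF (dot NE -1) THEN ext = STRMID(file, dot + 1) ELSE ext = ''

PRINT, ext   ; dat
```

Because the returned position is counted from the start of the string, it can be passed directly to STRMID without any conversion.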

Inserting the Contents of One String into Another

The STRPUT procedure writes the contents of one string over part of another. It has the form:

STRPUT, Destination, Source[, Position]

where Destination is the string to be overwritten, Source is the string to be inserted, and Position is the first character position within Destination at which Source will be written. If the optional argument Position is omitted, the overwrite starts at the first character (character position 0). STRPUT overwrites existing characters and never changes the length of Destination; any part of Source that would extend past the end of Destination is discarded. The following IDL statements use STRPOS and STRPUT to replace every occurrence of the word “dog” with the word “CAT” in the string “dog cat duck rabbit dog cat dog”:

; The string to search. "dog" appears three times.
animals = 'dog cat duck rabbit dog cat dog'

; While any occurrence of "dog" exists, replace it.
WHILE ((I = STRPOS(animals, 'dog')) NE -1) DO $
   STRPUT, animals, 'CAT', I

; Show the resulting string.
PRINT, animals

Running the above statements produces the result below.

CAT cat duck rabbit CAT cat CAT
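The example above works because “dog” and “CAT” have the same length. The sketch below is illustrative only; its variable names and sample strings are assumptions. It shows what happens when the source does not fit, and one way to substitute a word of a different length by rebuilding the string with STRMID instead of STRPUT.

```idl
; STRPUT overwrites in place and never lengthens the destination.
s = 'abcdef'
STRPUT, s, 'XYZ', 4
PRINT, s          ; abcdXY  (the 'Z' does not fit and is dropped)

; Replace one occurrence of a word with a longer word by rebuilding
; the string from the pieces before and after the match.
animals = 'dog cat duck'
old_word = 'dog'
new_word = 'horse'

i = STRPOS(animals, old_word)
IF (i NE -1) THEN $
   animals = STRMID(animals, 0, i) + new_word + $
             STRMID(animals, i + STRLEN(old_word))

PRINT, animals    ; horse cat duck
```

Note that a replace-until-gone loop like the one above never terminates if the replacement text itself contains the search string, so a different approach is needed in that case.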

Extracting Substrings

The STRMID function is used for extracting substrings from a larger string. It has the form:

STRMID(Expression, First_Character [, Length])

where Expression is the string from which the substring will be extracted, First_Character is the starting position within Expression of the substring (the first position is position 0), and Length is the length of the substring to extract. If there are not Length characters following the position First_Character, the substring will be truncated. If the Length argument is not supplied, STRMID extracts all characters from the specified starting position to the end of the string. The following IDL statements use STRMID to print a table matching the number of each month with its three-letter abbreviation:

; String containing all the month names.
months = 'JANFEBMARAPRMAYJUNJULAUGSEPOCTNOVDEC'

; Extract each name in turn. The expression (I - 1) * 3 calculates the
; position within months of each abbreviation. The FORMAT keyword
; fixes the column widths of the output.
FOR I = 1, 12 DO PRINT, I, STRMID(months, (I - 1) * 3, 3), $
   FORMAT = '(I2, 6X, A)'

The result of executing these statements is as follows:

 1      JAN
 2      FEB
 3      MAR
 4      APR
 5      MAY
 6      JUN
 7      JUL
 8      AUG
 9      SEP
10      OCT
11      NOV
12      DEC
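The rules for an omitted or oversized Length argument can be seen in a few one-line calls. This is a short sketch using an arbitrary sample string:

```idl
s = 'IDL is fun.'

; Length omitted: extract from position 7 to the end of the string.
PRINT, STRMID(s, 7)                   ; fun.

; Length longer than what remains: the result is truncated, not padded.
PRINT, STRMID(s, 7, 100)              ; fun.

; First word: everything before the first space, located with STRPOS.
PRINT, STRMID(s, 0, STRPOS(s, ' '))   ; IDL
```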